NSF PAR Search | NSF Public Access Repository

PADS: Power Budgeting with Diagonal Scaling for Performance-Aware Cloud Workloads

https://doi.org/10.1109/IGSC64514.2024.00012

Savasci, Mehmet; Souza, Abel; Irwin, David; Ali-Eldin, Ahmed; Shenoy, Prashant (November 2024, IEEE)

Cloud platforms’ rapid growth raises significant concerns about their electricity consumption and resulting carbon emissions. Power capping is a known technique for limiting the power consumption of the data centers where workloads are hosted. Today’s data center clusters co-locate latency-sensitive web workloads and throughput-oriented batch workloads. When power capping is necessary, the ideal is to throttle only the batch tasks without restricting the latency-sensitive web workloads, because Service-Level Objectives (SLOs) require low response times for those workloads. This paper proposes PADS, a hardware-agnostic, workload-aware power capping system. Because it does not rely on hardware mechanisms such as RAPL or DVFS, it can keep the power consumption of clusters with heterogeneous architectures such as x86 and ARM below the enforced power limit while minimizing the impact on latency-sensitive tasks. It uses an application-performance model of both latency-sensitive and batch workloads to ensure power safety with controllable performance. Our power capping technique uses diagonal scaling and relies on the control group (cgroup) feature of the Linux kernel. Our results indicate that PADS is highly effective in reducing power while respecting the tail latency requirement of the latency-sensitive workload. Furthermore, compared to state-of-the-art solutions, PADS demonstrates lower P95 latency, accompanied by a 90% higher effectiveness in respecting power limits.
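The abstract does not spell out how diagonal scaling maps onto cgroups, so the following is only a minimal sketch of the general idea, not the PADS algorithm. It assumes "diagonal scaling" in its usual sense of combining horizontal scaling (number of replicas) with vertical scaling (CPU per replica), and it assumes a cgroup v2 host, where a CPU limit is written to a `cpu.max` file as `"<quota_us> <period_us>"`. The function names, the per-replica core ceiling, and the example numbers are hypothetical.

```python
import math

# Default cgroup v2 CPU bandwidth period, in microseconds (100 ms).
CFS_PERIOD_US = 100_000


def diagonal_allocation(total_cores: float, max_cores_per_replica: float,
                        min_replicas: int = 1) -> tuple[int, float]:
    """Split a CPU budget across replicas (hypothetical policy).

    Scale out (horizontal) just far enough that no replica exceeds its
    ceiling, then size every replica equally (vertical).
    """
    if total_cores <= 0 or max_cores_per_replica <= 0:
        raise ValueError("CPU budget and per-replica ceiling must be positive")
    replicas = max(min_replicas, math.ceil(total_cores / max_cores_per_replica))
    return replicas, total_cores / replicas


def cpu_max_value(cores: float, period_us: int = CFS_PERIOD_US) -> str:
    """Format a core count as the string a cgroup v2 cpu.max file expects."""
    quota_us = int(round(cores * period_us))
    return f"{quota_us} {period_us}"


# Example: a 6-core budget with at most 4 cores per replica.
replicas, cores_each = diagonal_allocation(6.0, 4.0)
limit = cpu_max_value(cores_each)
```

A real controller would derive the CPU budget from the power limit and the application-performance model, then write the resulting string to each container's `cpu.max` file; this sketch deliberately stops before touching the filesystem.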
Full Text Available
SLO-Power: SLO and Power-aware Elastic Scaling for Web Services

https://doi.org/10.1109/CCGrid59990.2024.00025

Savasci, Mehmet; Souza, Abel; Wu, Li; Irwin, David; Ali-Eldin, Ahmed; Shenoy, Prashant (May 2024, IEEE)

Full Text Available
DDPC: Automated Data-Driven Power-Performance Controller Design on-the-fly for Latency-sensitive Web Services

https://doi.org/10.1145/3543507.3583437

Savasci, Mehmet; Ali-Eldin, Ahmed; Eker, Johan; Robertsson, Anders; Shenoy, Prashant (April 2023, ACM Web Conference)

Traditional power reduction techniques such as DVFS or RAPL are challenging to use with web services because they significantly affect the services’ latency and throughput. Previous work suggested the use of controllers based on control theory or machine learning to reduce performance degradation under constrained power. However, generating these controllers is challenging, as every web service application running in a data center requires a power-performance model and a fine-tuned controller. In this paper, we present DDPC, a system for autonomic data-driven controller generation for power-latency management. DDPC automates the process of designing and deploying controllers for dynamic power allocation to manage the power-performance trade-offs for latency-sensitive web applications such as a social network. For each application, DDPC uses system identification techniques to learn an adaptive power-performance model that captures the application’s power-latency trade-offs. This model is then used to generate and deploy a Proportional-Integral (PI) power controller with gain scheduling that dynamically manages the power allocation to the server running the application using RAPL. We evaluate DDPC with two realistic latency-sensitive web applications under varying load scenarios. Our results show that DDPC is capable of autonomically generating and deploying controllers within a few minutes, reducing the active power allocation of a web server by more than 50% compared to state-of-the-art techniques while maintaining the latency well below the target of the application.
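To make the control idea concrete, here is a minimal sketch of a generic gain-scheduled PI controller that maps a latency error to a power cap. It is a textbook formulation, not the controller DDPC generates: the gain table, the load thresholds, the power bounds, and the class name are all hypothetical, and applying the resulting cap through RAPL is left out.

```python
class GainScheduledPI:
    """PI controller whose gains depend on the current load (hypothetical sketch)."""

    def __init__(self, schedule, p_min, p_max, initial_power):
        # schedule: list of (load_threshold, kp, ki); the entry with the
        # highest threshold not exceeding the current load is used.
        self.schedule = sorted(schedule)
        self.p_min, self.p_max = p_min, p_max
        # The integral state is kept in watts, so switching gains between
        # steps does not cause a jump in the accumulated term.
        self.integral = initial_power

    def gains(self, load):
        _, kp, ki = self.schedule[0]
        for threshold, kp_i, ki_i in self.schedule:
            if load >= threshold:
                kp, ki = kp_i, ki_i
        return kp, ki

    def step(self, latency_ms, target_ms, load, dt=1.0):
        """Return the next power cap in watts."""
        kp, ki = self.gains(load)
        error = latency_ms - target_ms  # positive: too slow, allocate more power
        new_integral = self.integral + ki * error * dt
        raw = kp * error + new_integral
        power = min(max(raw, self.p_min), self.p_max)
        if power == raw:
            # Anti-windup: only accumulate while the output is not saturated.
            self.integral = new_integral
        return power


# Example with made-up gains: low-load and high-load operating regions.
ctrl = GainScheduledPI([(0, 0.1, 0.05), (500, 0.2, 0.1)],
                       p_min=40.0, p_max=120.0, initial_power=80.0)
cap = ctrl.step(latency_ms=150.0, target_ms=100.0, load=100.0)
```

In a setup like the one the abstract describes, the gains would come from the identified power-performance model rather than a hand-written table, and the returned cap would be enforced on the server through RAPL.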
Full Text Available